Distributed API Protocol Mining
Abstract
Dynamic Protocol Mining (DPM) techniques are a promising approach to inferring useful API protocols automatically. However, their results are biased toward the input test cases, and the instrumentation overhead limits their usability in industrial practice. In this paper, we propose NSpecMiner, a distributed dynamic protocol mining framework. The framework is based on a client-server architecture in which a client tracer gathers Program Execution Traces (PETs) and sends them to the server for mining. Mined protocols are stored on the server to provide remote services such as API protocol retrieval and program verification. Compared with local miners, NSpecMiner has several advantages: 1) A large number of diverse PETs are likely to be collected from multiple clients, which is essential for mining accurate and complete API protocols. 2) Instrumentation overhead can be balanced among multiple clients. 3) By integrating the client tracer into widely used software, we can mine API protocols transparently and automatically, without human effort. To evaluate our technique, we compared NSpecMiner with a local miner, ISpecMiner. Preliminary results show that our approach is as effective as local miners at mining useful API protocols. In addition, our method can gather PETs concurrently from multiple clients, and other merits of distributed technology should further benefit DPM.
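The client-server flow described in the abstract can be illustrated with a minimal sketch. It is not NSpecMiner's actual algorithm, which the abstract does not specify. As an assumption, the "protocol" here is simply the set of consecutive call pairs observed in the collected traces, and the names `ProtocolServer`, `submit`, `mine`, and `accepts` are hypothetical. Network transport is omitted: clients are modeled as direct calls to `submit`.

```python
from collections import Counter, defaultdict


class ProtocolServer:
    """Hypothetical server that pools traces from clients and mines call transitions."""

    START, END = "<start>", "<end>"

    def __init__(self):
        # client id -> list of traces; each trace is a list of API call names
        self.traces = defaultdict(list)

    def submit(self, client_id, trace):
        """Store one Program Execution Trace (PET) sent by a client tracer."""
        self.traces[client_id].append(list(trace))

    def _steps(self, trace):
        # Pad the trace so that first and last calls are part of the protocol.
        path = [self.START] + list(trace) + [self.END]
        return list(zip(path, path[1:]))

    def mine(self, min_support=1):
        """Return the call transitions seen at least min_support times overall."""
        counts = Counter()
        for client_traces in self.traces.values():
            for trace in client_traces:
                counts.update(self._steps(trace))
        return {step for step, n in counts.items() if n >= min_support}

    def accepts(self, trace, min_support=1):
        """Check a trace against the mined transitions (a simple verification service)."""
        allowed = self.mine(min_support)
        return all(step in allowed for step in self._steps(trace))


# Two clients exercise the same (made-up) file API in different ways.
server = ProtocolServer()
server.submit("client-a", ["open", "read", "close"])
server.submit("client-b", ["open", "write", "close"])

print(server.accepts(["open", "write", "close"]))  # seen only via client-b
print(server.accepts(["read", "close"]))           # never observed without open
```

Each client alone would yield an incomplete protocol (reads only or writes only). Pooling the traces on the server covers both usages, which is the diversity argument made in advantage 1. A pairwise transition model is deliberately crude and can over-generalize; a real miner would typically infer a richer automaton.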
Similar resources
An Api for Transparent Distributed Vertical Data Mining
New data mining tools and algorithms are available to the vertical data mining community for scalable and efficient data mining that discovers hidden nuggets in huge repositories of data. Most traditional data mining algorithms do not scale to these huge datasets. This is due to the insufficient computational resources currently available on a single machine for running these applications...
The WebSocket API as supporting technology for distributed and agent-driven data mining
Supporting technologies play an important role in distributed data mining systems. The flexibility and scalability of infrastructures and architectures can often determine the strength of a distributed data mining framework. In this paper we present some preliminary research work on a prototype for a distributed data mining framework. We shall show how the WebSocket API, which is a draft sp...
Extracting More Object Usage Scenarios for API Protocol Mining
Automatic protocol mining is a promising approach to infer precise and complete API protocols. However, the effect of the approach largely depends upon the quality of input object usage scenarios, in terms of noise and diversity. This paper aims to extract as many object usage scenarios as possible from object-oriented programs for automatic protocol mining. A large corpus of object usage scena...
An Efficient Data Indexing Approach on Hadoop Using Java Persistence API
Data indexing is common in data mining when working with high-dimensional, large-scale data sets. Hadoop, a cloud computing project using the MapReduce framework in Java, has become of significant interest in distributed data mining. To resolve problems of globalization, random-write and duration in Hadoop, a data indexing approach on Hadoop using the Java Persistence API (JPA) is elaborated in...
An API for Distributed Reasoning on Networked Ontologies with Alignments
In this paper, we describe the design and implementation of a Java interface for distributed reasoning on networked ontologies with alignments. This API is built over the standard OWLlink interface, which is a communication protocol between OWL2 components. It is compatible with common OWL-based reasoners such as Pellet and FaCT++ in centralized contexts. In this API, we have implemented an optimi...